K-sparse encoder for efficient information retrieval
Abstract
Modern industrial search engines typically employ a two-stage pipeline: fast candidate retrieval followed by reranking. This approach inevitably leads to the loss of some relevant documents due to the simplicity of algorithms used in the first stage. This work proposes a single-stage approach that combines the advantages of dense semantic search models with the efficiency of inverted indices. The key component of the solution is a K-sparse encoder used to convert dense vectors into sparse ones compatible with inverted indices of the Lucene library. In contrast to the previously studied identifiable variational autoencoder, the proposed model is based on an autoencoder with a TopK activation function which explicitly enforces a fixed number of non-zero coordinates during training. This activation function makes the sparse vector generation process differentiable, eliminates the need for post-processing, and simplifies the loss function to a sum of reconstruction error and a component preserving relative distances between dense and sparse representations. The model was trained on a 300,000-document subset of the MS MARCO dataset using PyTorch and an NVIDIA L4 GPU. The proposed model achieves 96.6 % of the quality of the original dense model in terms of the NDCG@10 metric (0.57 vs. 0.59) on the SciFact dataset with 80 % sparsity. It is also shown that further increasing sparsity reduces index size and improves retrieval speed while maintaining acceptable search quality. In terms of memory usage, the approach outperforms the Hierarchical Navigable Small World (HNSW) graph-based algorithm, and at high sparsity levels, its speed approaches that of HNSW. The results confirm the applicability of the proposed approach to unstructured data retrieval. Direct control over sparsity enables balancing between search quality, latency, and memory requirements. 
Because it relies on an inverted index built with the Lucene library, the proposed solution is well suited to industrial-scale search systems. Future research directions include the interpretability of the extracted features and improving retrieval quality under high sparsity conditions.
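The retrieval side of the idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the learned encoder is omitted, the latent activations are made-up numbers, and the helper names (`top_k_sparsify`, `build_inverted_index`, `search`) are hypothetical. It assumes that the TopK step keeps the K largest activations and zeroes the rest, and it uses a plain dictionary of posting lists as a stand-in for a Lucene inverted index.

```python
from collections import defaultdict


def top_k_sparsify(activations, k):
    """Keep the k largest activations; return them as {dimension: value}.

    Assumption: TopK keeps the k largest values and zeroes everything else.
    """
    top_dims = sorted(range(len(activations)),
                      key=lambda i: activations[i],
                      reverse=True)[:k]
    return {dim: activations[dim] for dim in top_dims}


def build_inverted_index(sparse_docs):
    """Map each non-zero dimension to a posting list of (doc_id, weight)."""
    index = defaultdict(list)
    for doc_id, vector in sparse_docs.items():
        for dim, weight in vector.items():
            index[dim].append((doc_id, weight))
    return index


def search(index, sparse_query, top_n=10):
    """Score documents by dot product, visiting only the query's posting lists."""
    scores = defaultdict(float)
    for dim, query_weight in sparse_query.items():
        for doc_id, doc_weight in index.get(dim, ()):
            scores[doc_id] += query_weight * doc_weight
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Hypothetical latent activations (in the paper these would come from the
# trained encoder applied to dense embeddings).
K = 2
doc_activations = {
    "d1": [0.9, 0.1, 0.0, 0.8, 0.2],
    "d2": [0.0, 0.7, 0.6, 0.1, 0.05],
    "d3": [0.5, 0.0, 0.1, 0.9, 0.3],
}
sparse_docs = {doc_id: top_k_sparsify(vec, K)
               for doc_id, vec in doc_activations.items()}
index = build_inverted_index(sparse_docs)

sparse_query = top_k_sparsify([0.8, 0.0, 0.1, 0.7, 0.0], K)
results = search(index, sparse_query)
print(results)
```

In this sketch K directly sets how many postings each document contributes, so a smaller K means a smaller index and fewer postings to scan per query, at the cost of a coarser approximation of the dense similarity. This is the quality, latency, and memory trade-off the abstract refers to.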
Keywords
Permanent URL
Articles in current issue
- Apochromatic objective for imaging spectral systems of visible, near and short-wave infrared spectrum ranges
- Application of the cross-gain modulation in erbium-doped fiber to increase the effective spectral bandwidth of an interrogator
- Nonlinear transmission of fluorophosphate glass with quantum dots of cadmium and lead sulfides and selenides under near-IR femtosecond laser irradiation
- Methodology for estimation of sensitivity to vibration of optical components based on wavelet analysis of vibration-modulated radiation
- Characterization of Ar:N2 plasma mixture with optical emission spectroscopy during deposition of NbN coating
- Spectral diagnostics of Al-Ni alloys under laser irradiation: effect of laser energy on plasma parameters
- Application of anamorphic optics system and a high-speed line scan photodetector in an open-type relative encoder
- A structural study of N-(2-(2-(2-azidoethoxy)ethoxy)ethyl)-4,6-di(aziridin-1-yl)-1,3,5-triazin-2-amine by density functional theory calculations
- A method for generating digital avatar animation with speech and non-verbal synchronization based on bimodal data
- Leveraging machine learning for profiling IoT devices to identify malicious activities
- Font generation based on style and character structure analysis using diffusion models
- Anomaly detection under data scarcity and uncertainty using zero-shot and few-shot approaches
- The impact of adversarial attacks on a computer vision model's perception of images
- Set intersection protocol with privacy preservation
- Comparative analysis method for time series data objects represented as sets of strings based on de Bruijn graphs
- Application of modern methods for information security risks evaluation of a critical information infrastructure facility
- Optimizing knowledge distillation models for language models
- Algorithm for human interaction with a model of an industrial cyber-physical system by means of neural interface
- An improved authentication protocol for self-driving vehicles based on Diffie–Hellman algorithm
- Simulation and analytical model of reliability with possible replication of transmissions in a reconfigurable multipath wireless network
- Evaluating tram positioning accuracy on curves based on map data and segmented images
- Building an optimal refueling plan using aggregated information about route parameter values from open sources
- Hermite–Gauss wavelets: synthesis of discrete forms and investigation of properties